-
A key challenge in e-learning environments like Intelligent Tutoring Systems (ITSs) is to induce effective pedagogical policies efficiently. While Deep Reinforcement Learning (DRL) often suffers from sample inefficiency and the difficulty of reward function design, Apprenticeship Learning (AL) algorithms can overcome both. However, most AL algorithms cannot handle heterogeneity, as they assume all demonstrations are generated by a homogeneous policy driven by a single reward function. The AL algorithms that do consider heterogeneity often cannot generalize to large continuous state spaces and work only with discrete states. In this paper, we propose expectation-maximization (EM)-EDM, a general AL framework to induce effective pedagogical policies from given optimal or near-optimal demonstrations, which are assumed to be driven by heterogeneous reward functions. We compare the effectiveness of the policies induced by our proposed EM-EDM against four AL-based baselines and two policies induced by DRL on two different but related tasks that involve pedagogical action prediction. Our overall results show that, for both tasks, EM-EDM outperforms the four AL baselines and the two DRL baselines across all performance metrics. This suggests that EM-EDM can effectively model complex student pedagogical decision-making processes, because it can manage a large, continuous state space and handle diverse, heterogeneous reward functions with very few given demonstrations.
-
This study introduces a general approach for generating fuzzy logic rules in regression tasks with complex, high-dimensional input spaces. The method encodes data into a latent space, where the uniqueness of each encoding is analyzed to determine whether it merits becoming an exemplar. The efficacy of the proposed method is showcased through its application in predicting the acceleration of one of the links of the Unimation Puma 560 robot arm, overcoming the challenges posed by non-linearity and noise in the dataset.
-
Deep Reinforcement Learning (Deep RL) has revolutionized the field of Intelligent Tutoring Systems by providing effective pedagogical policies. However, the "black box" nature of Deep RL models makes it challenging to understand these policies. This study tackles this challenge by applying fuzzy logic to distill knowledge from Deep RL-induced policies into interpretable IF-THEN Fuzzy Logic Controller (FLC) rules. Our experiments show that these FLC policies significantly outperform the expert policy and student decisions, demonstrating the effectiveness of our approach. To further increase the interpretability of the FLC rules, we propose a Temporal Granule Pattern (TGP) mining algorithm. This work highlights the potential of fuzzy logic and TGP analysis to enhance understanding of Deep RL-induced pedagogical policies.
-
Intelligent Tutoring Systems (ITSs) leverage AI to adapt to individual students, and many ITSs employ pedagogical policies to decide what instructional action to take next in the face of alternatives. A number of researchers have applied Reinforcement Learning (RL) and Deep RL (DRL) to induce effective pedagogical policies. Much of the prior work, however, has been developed independently for a specific ITS and cannot directly be applied to another. In this work, we propose a Multi-Task Learning framework that combines Deep BIsimulation Metrics and DRL, named MTL-BIM, to induce a unified pedagogical policy for two different ITSs across different domains: logic and probability. Based on empirical classroom results, our unified RL policy performed significantly better than the expert-crafted policies and independently induced DQN policies on both ITSs.
-
In deductive domains, three metacognitive knowledge types in ascending order are declarative, procedural, and conditional learning. This work leverages Deep Reinforcement Learning (DRL) to provide adaptive metacognitive interventions that bridge the gap between the three knowledge types and prepare students for future learning across Intelligent Tutoring Systems (ITSs). Students received interventions that taught how and when to use a backward-chaining (BC) strategy on a logic tutor that supports a default forward-chaining strategy. Six weeks later, we trained students on a probability tutor that only supports BC, without interventions. Our results show that on both ITSs, DRL bridged the metacognitive knowledge gap between students and significantly improved their learning performance over their control peers. Furthermore, the DRL policy adapted to the metacognitive development on the logic tutor across declarative, procedural, and conditional students, causing their strategic decisions to be more autonomous.